Abstract. This paper discusses our initial experience with introducing
automated assume-guarantee verification based on learning in the
SPIN tool. We believe that compositional verification techniques such as 
assume-guarantee reasoning could complement the state-reduction tech- 
niques that SPIN already supports, thus increasing the size of systems 
that SPIN can handle. We present a “light-weight” approach to evalu- 
ating the benefits of learning-based assume-guarantee reasoning in the 
context of SPIN: we turn our previous implementation of learning for the 
LTSA tool into a main program that externally invokes SPIN to provide 
the model checking-related answers. Despite its performance overheads 
(which mandate a future implementation within SPIN itself), this ap- 
proach provides accurate information about the savings in memory. We 
have experimented with several versions of learning-based assume guar- 
antee reasoning, including a novel heuristic introduced here for generat- 
ing component assumptions when their environment is unavailable. We 
illustrate the benefits of learning-based assume-guarantee reasoning in 
SPIN through the example of a resource arbiter for a spacecraft. 
Keywords: assume-guarantee reasoning, model checking, learning 


1 Introduction 

This paper describes work performed in the context of the Reliable Software Sys- 
tems Development (RSSD) project headed by the NASA Jet Propulsion Labora- 
tory (JPL). The aim of RSSD is to improve the reliability and safety of software 
systems to support human and robotic exploration of space. The emphasis is 
on tool support for the development of verifiable software - tools will be appli- 
cable at all stages of the software development, and will target the C language 
for implementation. For design, the tool that will be supported is SPIN [18] for 
the following two main reasons. SPIN has been used extensively and successfully
for industrial applications. Moreover, SPIN enables embedding of C code,
which allows designs to be combined with implementations. The users of the tool are
thus offered the convenience of using a single environment for verification when 
transitioning between different phases of the software development. 

We present here a component of this project that is a collaborative effort 
between JPL and NASA Ames, and which aims at investigating whether/how 
compositional techniques can benefit SPIN in dealing with software designs. The 
compositional techniques that we investigate are based on automated assumption
generation for assume-guarantee reasoning, as presented in [5,10,13]. The
techniques were implemented in the LTSA tool [1] for the analysis of design
models encoded as finite-state labelled transition systems with blocking
communication. Although these techniques have proven effective in the LTSA [25],
there is no guarantee that they will be (as) successful in the context of other 
model checkers. For example, as seen in [14], the savings obtained with auto- 
mated assume-guarantee reasoning at the design level with LTSA were more 
pronounced than those obtained at the (Java) code level with Java PathFinder
[30]. The LTSA is by nature a compositional tool, which means that any com- 
ponent in isolation can be targeted for analysis, without the need to provide 
an environment to turn it into a “closed” system, which is the case for a Java 
component. Moreover, the amount of detail at the code level makes state spaces 
larger and may “hide” the size of the savings obtained from a particular
approach. SPIN lies somewhere in between the two tools: SPIN’s input language,
Promela, is closer to a programming language than the input language of the
LTSA, but SPIN is a design as opposed to code-level tool. 

The work reported here is a first study of the issues and benefits of introduc- 
ing compositional techniques into SPIN (which is not inherently a compositional 
tool). Our approach has been to make such an evaluation in a “light-weight”
fashion, that is, to avoid re-implementing our algorithms within SPIN itself. We 
will describe how we turned our existing implementations into a main program 
that invokes SPIN to provide answers to specific model checking questions. As 
will be discussed later, such an approach has a number of disadvantages, such as
high time overheads. However, we claim that it provides a good way for
researchers to make a quick evaluation of the potential benefits of compositional 
techniques in their model-checking environment. After all, the main interest in 
model checking is to obtain savings in memory, and these can be evaluated ac- 
curately with the framework that we propose. 

We will discuss the technical details involved in the implementation of our 
“light-weight” compositional framework for SPIN. For simplicity, we only look 
into Promela programs where components communicate in a “rendez-vous” fash- 
ion (i.e., Promela channels of size 0). Note that our evaluations will also include a 
novel heuristic presented in this paper for generating component interface speci- 
fications using learning. The description of our approach will be given in terms of 
a running example of a client-server system. We will then discuss the application 
of our techniques to the larger case study of a resource arbiter for a spacecraft, 
where learning-based assume-guarantee reasoning achieved significant memory 
gains. 

To summarize, the contributions of this paper are: 1) an approach for fast and 
easy evaluation of the benefits that compositional verification techniques based 
on learning can bring in the context of any model checker, 2) a description of the
technical details involved in the implementation of this approach in SPIN,
3) the discussion of a novel heuristic for learning assumptions of components in 
isolation, and 4) the application of our approach to a realistic resource arbiter 
for a spacecraft, for which it achieved significant memory gains over traditional 
monolithic model checking.

The remainder of the paper is organized as follows. We give background
on assume-guarantee reasoning and learning in Section 2. A description of our 
proposed approach is provided in Section 3, with the technical details of its im- 
plementation in SPIN presented in Section 4. Section 5 discusses our experience 
with applying our approach to the resource arbiter case study. Finally, Section 6 
presents related work and Section 7 concludes the paper. 

2 Background 

2.1 Assume Guarantee Reasoning 

We address the problem of checking designs using model checking. We use
compositional techniques for increased scalability. For simplicity, let us consider two
software components $M_1$ and $M_2$ (represented as finite state labeled transition
systems) and a safety property $P$ (expressed as a finite state automaton).
Reasoning about more than two components will be discussed later in Section 3.

The goal is to check if the two components operate correctly together to
achieve the desired property, i.e., to check $M_1 \parallel M_2 \models P$ using model
checking techniques. Here, the parallel composition operator $\parallel$ denotes the product
construction for finite state automata, where the behavior of two components is
combined by synchronization of common actions and interleaving of remaining
actions. Property $P$ encodes the desired interactions between components.
Checking $M_1 \parallel M_2 \models P$ directly may be too expensive (there may not be enough time
and memory resources to complete the computation), so we break up the
verification into two smaller sub-problems, i.e., we check $M_1$ and $M_2$ separately, using
assume-guarantee reasoning.
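
The product construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the LTSA or SPIN implementation: the dictionary-based LTS encoding and the function name `compose` are assumptions made for this example.

```python
from collections import deque


def compose(lts1, lts2):
    """Parallel composition of two labeled transition systems (LTSs).

    Hypothetical encoding: each LTS is a dict with keys 'init' (a state),
    'alphabet' (a set of actions) and 'trans' (a dict mapping
    (state, action) to a set of successor states).
    Shared actions synchronize; all remaining actions interleave.
    """
    shared = lts1["alphabet"] & lts2["alphabet"]
    alphabet = lts1["alphabet"] | lts2["alphabet"]
    init = (lts1["init"], lts2["init"])
    trans = {}
    seen = {init}
    queue = deque([init])
    while queue:
        s1, s2 = queue.popleft()
        for action in alphabet:
            next1 = lts1["trans"].get((s1, action), set())
            next2 = lts2["trans"].get((s2, action), set())
            if action in shared:
                # Both components must move together.
                succs = {(t1, t2) for t1 in next1 for t2 in next2}
            elif action in lts1["alphabet"]:
                # Only the first component moves.
                succs = {(t1, s2) for t1 in next1}
            else:
                # Only the second component moves.
                succs = {(s1, t2) for t2 in next2}
            if succs:
                trans[((s1, s2), action)] = succs
            for target in succs:
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return {"init": init, "alphabet": alphabet, "trans": trans, "states": seen}


# Two toy components that share only the action "req".
m1 = {"init": 0, "alphabet": {"req", "work"},
      "trans": {(0, "req"): {1}, (1, "work"): {0}}}
m2 = {"init": 0, "alphabet": {"req", "log"},
      "trans": {(0, "req"): {1}, (1, "log"): {0}}}
system = compose(m1, m2)
```

In the toy system, "req" can only happen when both components are ready for it, while "work" and "log" interleave freely; this interleaving is what makes the state space of the composition grow and motivates checking the components separately.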

In the assume-guarantee paradigm a formula is a triple $\langle A \rangle\, M\, \langle P \rangle$, where $M$
is a component, $P$ is a property, and $A$ is an assumption about $M$’s environment.
The formula is true if whenever $M$ is part of a system satisfying $A$, then the
system must also guarantee $P$.

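The simplest assume-guarantee proof rule shows that if $\langle A \rangle\, M_1\, \langle P \rangle$ and
$\langle \mathit{true} \rangle\, M_2\, \langle A \rangle$ hold, then $\langle \mathit{true} \rangle\, M_1 \parallel M_2\, \langle P \rangle$ also holds. This proof strategy
can also be expressed as an inference rule as follows:

$$
\frac{\begin{array}{ll}
\text{(Premise 1)} & \langle A \rangle\, M_1\, \langle P \rangle \\
\text{(Premise 2)} & \langle \mathit{true} \rangle\, M_2\, \langle A \rangle
\end{array}}{\langle \mathit{true} \rangle\, M_1 \parallel M_2\, \langle P \rangle}
$$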

Thus, using this rule we can show that $P$ holds on $M_1 \parallel M_2$, by checking
$\langle A \rangle\, M_1\, \langle P \rangle$ and $\langle \mathit{true} \rangle\, M_2\, \langle A \rangle$ separately. More elaborate rules can be used
for this style of reasoning [5]. The underlying aim for all such rules is to make 
model checking of their premises cheaper, in terms of time and in particular 
consumed memory, than non-compositional verification. To achieve this, the as- 
sumptions have to be much smaller than the analyzed components. Coming up 
with appropriate assumptions is traditionally a difficult, manual process.

Fig. 1. The L* learning algorithm


In previous work we proposed to use an off-the-shelf learning algorithm, L*, to
derive appropriate assumptions automatically. Initial approximate assumptions
are gradually refined by means of learning from counterexample traces obtained
by model checking assume-guarantee triples.


2.2 The L* Learning Algorithm 

The learning algorithm used by our approach was developed by Angluin and
later improved by Rivest and Schapire. We refer to the improved version by the
name of the original algorithm, L*. L* learns an unknown regular language and
produces a DFA that accepts it (see Figure 1). Let $U$ be an unknown regular
language over some alphabet $\Sigma$. In order to learn $U$, L* needs to interact with a
Minimally Adequate Teacher. The Teacher must be able to correctly answer two
types of questions from L*. The first type is a membership query, consisting of
a string $s \in \Sigma^*$; the answer is true if $s \in U$, and false otherwise. For the second
type of question, the learning algorithm generates a conjecture, i.e., a candidate
DFA $A$ whose language the algorithm believes to be identical to $U$. The answer
is true if $\mathcal{L}(A) = U$. Otherwise the Teacher returns a counterexample, which is
a string $s$ in the symmetric difference of $\mathcal{L}(A)$ and $U$.

At a higher level, L* creates a table where it incrementally records whether
strings in $\Sigma^*$ belong to $U$. It does this by making membership queries to the
Teacher. At various stages L* decides to make a conjecture. It constructs a
candidate automaton $A$ based on the information contained in the table and asks
the Teacher whether the conjecture is correct. If it is, the algorithm terminates.
Otherwise, L* uses the counterexample returned by the Teacher to extend the
table with strings that witness differences between $\mathcal{L}(A)$ and $U$.
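
To make the two kinds of questions concrete, the following Python sketch shows a Minimally Adequate Teacher for the simple case where the target language is given explicitly as a complete DFA. This is an assumption made purely for illustration: in the framework of this paper the Teacher is implemented with model checking, not with a known automaton. The class and method names (`Teacher`, `member`, `equivalent`) are hypothetical.

```python
from collections import deque


class Teacher:
    """Teacher for a target language given as a complete DFA (illustrative only)."""

    def __init__(self, alphabet, init, accepting, delta):
        self.alphabet = sorted(alphabet)
        self.init = init
        self.accepting = set(accepting)
        self.delta = delta  # complete map: (state, symbol) -> state

    def member(self, word):
        """Membership query: is the word in the target language U?"""
        state = self.init
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting

    def equivalent(self, c_init, c_accepting, c_delta):
        """Conjecture: return (True, None), or (False, counterexample).

        The counterexample is a shortest word in the symmetric difference
        of the conjectured language and U, found by a breadth-first search
        over pairs of states of the two complete DFAs.
        """
        start = (self.init, c_init)
        seen = {start}
        queue = deque([(start, ())])
        while queue:
            (p, q), word = queue.popleft()
            if (p in self.accepting) != (q in c_accepting):
                return False, word
            for symbol in self.alphabet:
                nxt = (self.delta[(p, symbol)], c_delta[(q, symbol)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, word + (symbol,)))
        return True, None


# Target U: words over {a, b} with an even number of a's.
even_a = Teacher(
    alphabet={"a", "b"},
    init=0,
    accepting={0},
    delta={(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1},
)
# A first, too permissive conjecture: one state that accepts every word.
answer = even_a.equivalent(0, {0}, {(0, "a"): 0, (0, "b"): 0})
```

Here the conjecture accepts the word "a", which is not in the target language, so the Teacher rejects it and returns that word as the counterexample.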


Characteristics of L*. L* is guaranteed to terminate with a minimal
automaton $A$ for the unknown language $U$. The conjectures made by L* strictly
increase in size; each conjecture is smaller than the next one, and all incorrect
conjectures are smaller than $A$. Therefore, if $A$ has $n$ states, L* makes at most
$n - 1$ incorrect conjectures. The number of membership queries made by L* is
$O(kn^2 + n \log m)$, where $k$ is the size of the alphabet of $U$, $n$ is the number of
states in the minimal DFA for $U$, and $m$ is the length of the longest
counterexample returned when a conjecture is made.

3 Tool Architecture 

We present here an initial study for a tool-based approach to compositional 
verification, that uses the L* algorithm to build assumptions and the SPIN 
model checking tool to check assume guarantee triples. Although using learning 
to automate assume guarantee reasoning was introduced in our previous work, 
there are some novel ideas that we propose here: 

- We present a generic tool architecture that uses learning for automated
assume-guarantee reasoning for multiple components. By generic, we mean
that the tool can be instantiated with different model checking tools for
checking assume-guarantee triples; we discuss the use of SPIN here.

- The tool can be used for checking different assume-guarantee rules (as
before), but in addition we present a novel heuristic that allows us to derive the
interface specification for a component $M_1$, in the absence of a specification
of its environment (i.e., $M_2$). This interface specification can be used to check
if the component $M_1$ behaves correctly in multiple contexts. In the past, we
have experimented with an approach that uses conformance checking [10].
Instead of using this expensive approach, we present a light-weight heuristic
that enables cheaper generation of precise interface specifications.

The architecture of the compositional verification tool is illustrated in
Figure 2. The architecture is derived from our previous work on compositional
verification and the implementation of our algorithms in the context of the LTSA tool,
both within the core of the LTSA [10], and as an LTSA plugin [1]. The goal is
to use learning to derive an assumption $A$ such that the assume-guarantee triple
$\langle A \rangle\, M_1\, \langle P \rangle$ evaluates to true. The weakest assumption $A_w$ under which $M_1$
satisfies $P$ is such that, for any environment component $M_2$, $\langle \mathit{true} \rangle\, M_1 \parallel M_2\, \langle P \rangle$ if
and only if $\langle \mathit{true} \rangle\, M_2\, \langle A_w \rangle$. In our framework, L* attempts to build $A_w$ through
iterative learning. For L* to learn $A_w$, we need to provide a Teacher that is able
to answer the two different kinds of questions that L* asks. Our approach uses
model checking to implement such a Teacher.


Membership Queries. To answer a membership query for $s$, the Teacher
simulates $s$ to check if it may lead to a violation. For simplicity, our current
implementation for SPIN reduces the simulation to model checking. If there is no
violation, it means that $s \in \mathcal{L}(A_w)$, because $M_1$ does not violate $P$ in the
context of $s$, so the Teacher returns true. Otherwise, the answer to the membership
query is false.


Conjectures. Our framework uses the conjectures returned by L* as
intermediate candidate assumptions $A_i$. The Teacher uses two oracles: Oracle 1 guides
L* towards a conjecture that is strong enough to make $\langle A \rangle\, M_1\, \langle P \rangle$ true. Once
this is accomplished, the resulting conjecture may be too strong, in which case
our framework uses Oracle 2 to guide L* towards a weaker conjecture. There are
many options for implementing Oracle 2, and we discuss some of them below.

Oracle 1 checks $\langle A_i \rangle\, M_1\, \langle P \rangle$. If this does not hold, the model checker returns
a counterexample. The Teacher informs L* that its conjecture $A_i$ is not correct
and provides the counterexample to witness this fact. If, instead, $\langle A_i \rangle\, M_1\, \langle P \rangle$
holds, the Teacher forwards $A_i$ to Oracle 2.

Oracle 2 needs to ensure that the candidate assumption is indeed the weakest.
In the context of this work, we have implemented different versions of this oracle.

- If component $M_2$ is available, then the oracle checks $\langle \mathit{true} \rangle\, M_2\, \langle A_i \rangle$ (as in
our previous work). If the result of model checking is true, the Teacher returns
true. Whether $A_i$ represents the weakest assumption or not, our framework
then terminates the verification because, according to the assume-guarantee
compositional rule, $P$ has been proved on $M_1 \parallel M_2$. If model checking
returns a counterexample, the oracle performs counterexample analysis. If the
counterexample indicates a real error, the framework stops and the error is
reported to the user. Alternatively, if the counterexample indicates that the
current candidate assumption needs to be refined, it is returned to guide L*.
- We have extended our implementation to reasoning for $n$ components $M_1 \parallel M_2 \parallel \ldots \parallel M_n$.
The system is decomposed into two parts, $M_1$ and $M_2' = M_2 \parallel \ldots \parallel M_n$, and the
learning algorithm is invoked recursively for checking the second premise of
the rule.
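
The way the Teacher combines the two oracles when answering a conjecture can be summarized by the following Python sketch. It only mirrors the control flow described above for the case where $M_2$ is available; the function name and the three callbacks (`check_premise1`, `check_premise2`, `is_real_error`) are hypothetical placeholders for the model checking runs and the counterexample analysis, not part of the actual tool.

```python
def answer_conjecture(assumption, check_premise1, check_premise2, is_real_error):
    """Answer an L* conjecture using Oracle 1 and Oracle 2.

    Hypothetical callbacks:
      check_premise1(A) -> (holds, counterexample) for <A> M1 <P>
      check_premise2(A) -> (holds, counterexample) for <true> M2 <A>
      is_real_error(cex) -> True if cex is a real violation of P
    Returns one of ("refine", cex), ("property_holds", None),
    ("property_violated", cex).
    """
    # Oracle 1: is the candidate strong enough for M1 to satisfy P?
    holds, cex = check_premise1(assumption)
    if not holds:
        return ("refine", cex)

    # Oracle 2: does the environment M2 satisfy the candidate?
    holds, cex = check_premise2(assumption)
    if holds:
        # Both premises hold, so P holds on M1 || M2.
        return ("property_holds", None)

    # Counterexample analysis: real error or assumption too strong?
    if is_real_error(cex):
        return ("property_violated", cex)
    return ("refine", cex)


# Stub oracles, only to exercise the three possible outcomes.
too_weak = answer_conjecture(
    "A0", lambda a: (False, ["grant.u1", "grant.u2"]),
    lambda a: (True, None), lambda c: False)
proved = answer_conjecture(
    "A1", lambda a: (True, None), lambda a: (True, None), lambda c: False)
real_bug = answer_conjecture(
    "A2", lambda a: (True, None), lambda a: (False, ["t"]), lambda c: True)
```

A "refine" answer hands the counterexample back to L*, which then produces the next candidate assumption.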


3.1 Generation of Interface Specifications 

As discussed, Oracle 2 is responsible for ensuring that an assumption $A$, shown
strong enough by Oracle 1, is not too strong. In other words, the assumption
should include all traces over the alphabet of the assumption, in the context of
which $M_1$ satisfies the property $P$. By alphabet we mean the set of events that
are involved in a state machine.

We discuss here the case where the alphabet of the property and the
alphabet of the assumption are the same. We restrict ourselves to this case for 
simplicity, but also because it covers all the examples that we discuss in this 
paper. We are currently studying different cases and plan on extending this 
heuristic for those. 

Let $T_A$ denote the set of all traces over the alphabet of the assumption
$A$. Then $A$ should include all traces in $T_A$ that satisfy the property; if some
trace $t \in T_A$ that satisfies $P$ is not in $A$, then $A$ is too strong, so $t$
must be returned to the learning algorithm for the assumption to be refined.
The above check can be formulated as $P \models A$, and can be performed by a
model checker, with the counterexamples returned to the learning algorithm.
Our proposed heuristic for Oracle 2 for generating interface specifications is
therefore to implement the check $P \models A$.
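
As an illustration of the check $P \models A$, the Python sketch below searches for a trace that the property allows but the assumption does not. It is a minimal sketch under stated assumptions: both automata are encoded as deterministic, prefix-closed automata with partial transition maps (a missing transition means the trace is rejected), the function name `find_missing_trace` is hypothetical, and in our framework this check is delegated to the model checker rather than implemented directly. The example data encode the mutual exclusion property and the candidate assumption used in the client-server example of Section 4.

```python
from collections import deque


def find_missing_trace(prop, assumption, alphabet):
    """Return a shortest trace allowed by prop but not by assumption, or None.

    prop and assumption are (initial_state, delta) pairs, where delta is a
    partial map (state, action) -> state over the common alphabet.
    """
    p_init, p_delta = prop
    a_init, a_delta = assumption
    start = (p_init, a_init)
    seen = {start}
    queue = deque([(start, [])])
    while queue:
        (p, a), trace = queue.popleft()
        for action in sorted(alphabet):
            if (p, action) not in p_delta:
                continue  # the property itself forbids this action here
            if (a, action) not in a_delta:
                return trace + [action]  # allowed by P, missing from A
            nxt = (p_delta[(p, action)], a_delta[(a, action)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [action]))
    return None


ALPHABET = {"grant.u1", "grant.u2", "cancel.u1", "cancel.u2"}

# Mutual exclusion property (the trace assertion of the running example).
PROPERTY = ("Q0", {
    ("Q0", "grant.u2"): "Q4", ("Q0", "grant.u1"): "Q5",
    ("Q4", "cancel.u2"): "Q0", ("Q5", "cancel.u1"): "Q0",
})

# Candidate assumption of the running example.
CANDIDATE = ("Q0", {
    ("Q0", "grant.u1"): "Q2", ("Q0", "grant.u2"): "Q3", ("Q0", "cancel.u1"): "Q1",
    ("Q1", "grant.u1"): "Q1", ("Q1", "grant.u2"): "Q1",
    ("Q1", "cancel.u1"): "Q1", ("Q1", "cancel.u2"): "Q1",
    ("Q2", "grant.u1"): "Q1", ("Q2", "cancel.u1"): "Q0",
    ("Q3", "cancel.u1"): "Q1", ("Q3", "cancel.u2"): "Q0",
})

# A deliberately too strong variant that never lets u2 obtain the resource.
TOO_STRONG = ("Q0", {k: v for k, v in CANDIDATE[1].items()
                     if k != ("Q0", "grant.u2")})

covered = find_missing_trace(PROPERTY, CANDIDATE, ALPHABET)
missing = find_missing_trace(PROPERTY, TOO_STRONG, ALPHABET)
```

For the candidate assumption no trace is missing, while the too strong variant yields the one-step trace "grant.u2", which would be returned to L* so that the assumption can be weakened.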

Note that our heuristic is not always accurate, meaning that it may fail to
report traces that the assumption does not include even though it should. The
traces that it may miss are traces that violate $P$ but that will never be exercised
in the context of the component $M_1$. These traces are the traces of $!M_1 \parallel !P$,
where $!M_1$ denotes the complement of $M_1$, and similarly for $P$. Computing the
complement of $M_1$ involves determinization, which may increase the state space
of $M_1$ exponentially in the worst case. For this reason, we do not include this
check in our heuristic. One may argue that many components do not exhibit this
worst-case complexity. For such components, however, rather than computing
$!M_1 \parallel !P$, it would make more sense to construct the assumption directly, using
the algorithm presented in our previous work [13]. Learning was introduced in
order to avoid the potential complexity of this computation [10].

It is worth mentioning that, although our heuristic as currently implemented 
may not always compute the weakest assumption, our experiments discussed 
later in the paper demonstrate that it is quite effective in practice.

4 Implementation 

Our implementation makes use of our previous Java implementation of L*, but
extended with support for analysis of multiple components through recursive

mtype = {u1, u2, Nobody};
chan request = [0] of {mtype};
chan cancel = [0] of {mtype};
chan grant = [0] of {mtype};
chan deny = [0] of {mtype};

active proctype server() {
    mtype resUser = Nobody;   /* current holder of the resource */
    mtype u;

S0: if
    :: request?u ->
        if
        :: (resUser == Nobody) -> grant!u; resUser = u; goto S0;
        :: else -> deny!u; goto S0;
        fi;
    :: cancel?u ->
        if
        :: (resUser == u) -> resUser = Nobody; goto S0;
        :: else -> goto S0;
        fi;
    fi;
}


Fig. 3. Promela code for server 


invocation, and for the new heuristic for Oracle 2. The learning now runs as a
stand-alone application that invokes SPIN (from within Java) to answer queries
and conjectures.

We consider here only a subset of Promela, where components are Promela 
processes that communicate through rendezvous channels. We consider safety 
properties that refer to the rendezvous communication between components. 
We leave for future work the extension of the approach to handling the full 
Promela language. We selected this subset of Promela because it bears a close 
correspondence to the type of models that we analyze in the context of LTSA. 
Moreover, several systems can be described in this subset. For example, the 
work presented in [11, 27] shows how abstracted Java and Ada programs can be 
translated into this exact subset of Promela. 

We illustrate the implementation on a simple Promela model for a client 
server application - see Figure 3 and Figure 4(left). The model has a server and 
two clients that communicate through global rendezvous channels. Note that the 
MER case study is a more complex version of this type of system. 

The clients send requests to make a reservation for using a common resource, 
they wait for the server to grant the reservation, they use the resource, after 
which they cancel the reservation. The server can grant or deny a request , such 
that the resource is used only by one client at a time. We analyzed a property
stating that the resource shall be used in a mutually exclusive fashion.

proctype client(mtype u) {
Init:
    if
    :: request!u
    fi;

PendingReservation:
    if
    :: grant?eval(u)
    :: deny?eval(u) -> goto Init;
    fi;

PendingCancel:
    if
    :: cancel!u -> goto Init
    fi;
}


trace {
Q0: if
    :: grant?u2 -> goto Q4;
    :: grant?u1 -> goto Q5;
    fi;

Q4: if
    :: cancel?u2 -> goto Q0;
    fi;

Q5: if
    :: cancel?u1 -> goto Q0;
    fi;
}


Fig. 4. Promela code for client (left) and mutual exclusion property (right) 


There are many ways of encoding (safety) properties in SPIN, such as basic
assertions, never claims or trace assertions [19]. We chose to encode properties as
trace assertions: the types of safety properties that we typically encounter refer
to valid sequences of channel operations, and trace assertions are specifically
designed for formulating such sequences. In Section 5 we discuss other formalisms
for encoding assume-guarantee triples. Figure 4 (right) shows the trace
assertion for the mutual exclusion property. The assertion specifies the correctness
requirement that receive operations on channel grant with u1 and u2 alternate
with receive operations on cancel with u1 and u2, respectively. In other words,
for mutual exclusion to be guaranteed, when a user is granted the resource, then
this user needs to cancel it before it gets granted to a different user. The trace
assertion defines an automaton that monitors the system execution (it changes
state when a channel operation that is within its scope is executed).
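
The monitoring behavior of the trace assertion can be mimicked by a small Python sketch. This is a simplified model for illustration only, not SPIN's actual semantics: events are assumed to be (channel, value) pairs for receive operations, the scope is assumed to be the set of channels named in the assertion, and the names `MUTEX`, `SCOPE` and `first_violation` are hypothetical.

```python
# Transitions of the mutual exclusion trace assertion (Figure 4, right).
MUTEX = {
    ("Q0", ("grant", "u2")): "Q4",
    ("Q0", ("grant", "u1")): "Q5",
    ("Q4", ("cancel", "u2")): "Q0",
    ("Q5", ("cancel", "u1")): "Q0",
}
# Channels whose operations the assertion observes (simplifying assumption).
SCOPE = {"grant", "cancel"}


def first_violation(events, delta=MUTEX, init="Q0", scope=SCOPE):
    """Return the index of the first event that the monitor cannot match,
    or None if the whole event sequence respects the assertion."""
    state = init
    for index, event in enumerate(events):
        channel, _value = event
        if channel not in scope:
            continue  # operations outside the scope do not move the monitor
        if (state, event) not in delta:
            return index  # the assertion has no matching transition
        state = delta[(state, event)]
    return None


good_run = [("request", "u1"), ("grant", "u1"), ("cancel", "u1"), ("grant", "u2")]
bad_run = [("grant", "u1"), ("grant", "u2")]
```

In `bad_run` the resource is granted to u2 while u1 still holds it, so the monitor gets stuck at the second event, which corresponds to a violation of mutual exclusion.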

In order to analyze this model using our learning-based implementation,
we first break up the system into its components, i.e., processes client(u1),
client(u2) and server(). We also need to provide the alphabet of actions for the
candidate assumptions. As discussed in Section 3.1, we set the alphabet of the
assumptions to be the same as the alphabet of properties.


Checking Assume-Guarantee Triples. In our approach, we use SPIN to
answer queries and oracles, which are encoded as assume-guarantee triples of
the form $\langle A \rangle\, M\, \langle P \rangle$. Here $A$ denotes a deterministic finite state automaton that
may encode traces (in the case of queries) or candidate assumptions generated
by L*. Property $P$ is also a deterministic finite state automaton (encoded
as a trace assertion). The assumptions define execution environments for the
components under analysis. We therefore encode them as Promela processes that
run in parallel with the analyzed components (and thus restrict their behavior).
The assumption A and the property P are used to examine the component M

active proctype query() {
    grant!u1;
    grant!u2;
}

active proctype UniversalEnv() {
    do  /* actions unmatched in M1 || A */
    :: request?u1
    :: deny!u1
    /* actions of other users */
    :: grant?u2
    :: cancel!u2
    :: grant?u3
    :: cancel!u3
    :: grant?u4
    :: cancel!u4
    :: grant?u5
    :: cancel!u5
    od
}

active proctype CandidateAssumption() {
Q0: if
    :: grant!u1 -> goto Q2;
    :: grant!u2 -> goto Q3;
    :: cancel?u1 -> goto Q1;
    fi;

Q1: if
    :: grant!u1 -> goto Q1;
    :: grant!u2 -> goto Q1;
    :: cancel?u1 -> goto Q1;
    :: cancel?u2 -> goto Q1;
    fi;

Q2: if
    :: grant!u1 -> goto Q1;
    :: cancel?u1 -> goto Q0;
    fi;

Q3: if
    :: cancel?u1 -> goto Q1;
    :: cancel?u2 -> goto Q0;
    fi;
}


Fig. 5. Promela code for a query, an assumption and the universal environment 


and to check whether behaviors that are allowed by the assumption may lead to 
a property violation. 

To check an assume-guarantee triple, the teacher first creates a file that
encodes the assumption as a Promela process and the property as a trace assertion,
and then invokes SPIN, i.e., it executes the following commands:

spin -a M1.promela
cc -o pan pan.c -DSAFETY
./pan -E


The teacher waits for the verification to complete and it parses the output of the 
verification process to check if there were any assertion violations, in which case 
it returns false (together with the counterexample reported by SPIN) to the L* 
algorithm; otherwise, it returns true . All these steps are automated. 
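
The automated invocation can be sketched as follows. Our implementation drives SPIN from Java; the Python version below is only an illustrative sketch. The three commands are the ones listed above, whereas the function names, the working-directory handling and the assumption that the verifier summary contains a line of the form "errors: N" are choices made for this example. Extraction of the counterexample trail is omitted.

```python
import re
import subprocess


def pan_found_errors(pan_output):
    """Return True if the verifier output reports at least one error.

    Assumes the summary contains text of the form 'errors: N'.
    """
    match = re.search(r"errors:\s*(\d+)", pan_output)
    if match is None:
        raise ValueError("no 'errors:' entry found in verifier output")
    return int(match.group(1)) > 0


def check_triple(model_file, workdir="."):
    """Generate, compile and run the verifier for one encoded triple.

    Returns True if no violation was reported, False otherwise.
    Requires spin and a C compiler to be installed.
    """
    # Generate the verifier source (pan.c) from the Promela model.
    subprocess.run(["spin", "-a", model_file], cwd=workdir, check=True)
    # Compile the verifier for safety properties only.
    subprocess.run(["cc", "-o", "pan", "pan.c", "-DSAFETY"],
                   cwd=workdir, check=True)
    # Run the verifier and capture its report.
    result = subprocess.run(["./pan", "-E"], cwd=workdir,
                            capture_output=True, text=True)
    return not pan_found_errors(result.stdout)
```

Because every query and conjecture goes through all three steps, the generation and compilation steps are repeated for each call to `check_triple`, which is the source of the time overhead discussed in Section 5.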

As an example, Figure 5 shows the Promela process for checking a query on
component client(u1) for string “grant!u1; grant!u2;”. Figure 5 also shows
the Promela process for a CandidateAssumption for client(u1).

We should note that both properties and assumptions are global , i.e. they may 
refer to actions that are not local to the component under analysis. In order to 
check in isolation whether a component violates a global property, we need to 
provide an environment that substitutes the rest of the system, as typically per- 
formed in model checking. In the context of checking assume-guarantee triples, 
the environment is the universal environment as restricted by the assumption. To 
simulate that, we provide for each component a universal environment for those 



Towards a Compositional SPIN 





Fig. 6. MER Architecture 


rendezvous actions that are not matched with actions in the provided
assumption. For example, Figure 5 shows such a closing environment for client(u1):
in an infinite loop, the process performs rendezvous for the actions that are
unmatched by client(u1) and the process encoding the assumption. Note that
the same universal environment is used for checking all the queries and oracles
for one particular component.
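
Such a closing environment has a regular shape, so its Promela text can be produced mechanically from the list of unmatched actions. The Python sketch below illustrates this; it is not part of our tool (for the case study the environments were written by hand), and the function name `universal_env` and the triple-based encoding of actions are assumptions made for the example.

```python
def universal_env(actions, name="UniversalEnv"):
    """Return Promela text for a process that repeatedly offers every
    given rendezvous action.

    actions is a list of (channel, operator, value) triples, where the
    operator is "?" for a receive and "!" for a send.
    """
    lines = [f"active proctype {name}() {{", "    do"]
    for channel, operator, value in actions:
        lines.append(f"    :: {channel}{operator}{value}")
    lines += ["    od", "}"]
    return "\n".join(lines)


# Actions left unmatched by client(u1) and its assumption (from Figure 5).
unmatched = [("request", "?", "u1"), ("deny", "!", "u1"),
             ("grant", "?", "u2"), ("cancel", "!", "u2")]
env_text = universal_env(unmatched)
```

The generated text has the same structure as the UniversalEnv process of Figure 5, restricted here to the first four actions.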

5 Analysis of the MER Resource Arbiter 

5.1 Description 

We experimented with our approach in the context of a model derived from a 
component of the flight software for JPL’s Mars Exploration Rovers (MER) - the 
MER arbiter (see Figure 6). The MER software contains 11 user threads. Each 
thread serves one specific application, such as imaging, controlling the robot 
arm, communicating with earth, and driving. There are 15 shared resources on 
the rover, to which access must be controlled by an arbiter, which is the target of 
our verification. The arbiter module prevents potential conflicts between resource 
requests, and enforces priorities. For instance, it would not make sense to start a 
communication session with earth while the rover is driving. The arbiter module 
consists of about 3,000 lines of source code, written in ANSI standard C. The 
arbiter system has been analyzed with SPIN before, in a non-compositional way;
a detailed description can be found in [20].


5.2 Analysis 

We present here the results of applying compositional analysis for a subproblem 
with 5 users and 5 resources. A design-level Promela model of the arbiter was 
created based on available documentation (3000 lines of code) and was used 
to check several properties. We report here the results for checking a mutual

Table 1. Arbiter Analysis Results

Analysis     MEM          State Space    Time, $t_{model} + t_{compile} + t_{run}$   Assumption Size
Monolithic   544.019 MB   3.91653e+06    0.021s + 0.854s + 33.745s                   N/A
Recursive    2.622 MB     1002           0.038s + 1.142s + 0.032s                    6 .. 12
Heuristic    2.622 MB     2941           0.044s + 1.392s + 0.021s                    12


exclusion property (P) stating that communication and driving cannot happen
at the same time.

The compositional techniques discussed in this paper work on a specific or- 
dering of the components in the system. For the MER system, we ordered the 
user components first as $(U_1 \ldots U_5)$ and the arbiter process last as $(ARB)$. As
described in [8], compositional techniques tend to be sensitive to different de- 
compositions of a system. The reason we selected this particular ordering was 
that part of the project involved experimenting with generating assumptions for 
the arbiter in the absence of an arbiter component. 

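We then used the learning tool described in Section 3 to automatically
generate assumptions $A_1 \ldots A_5$ such that:

$$
\begin{array}{l}
\langle A_1 \rangle\, U_1\, \langle P \rangle \\
\langle A_2 \rangle\, U_2\, \langle A_1 \rangle \\
\langle A_3 \rangle\, U_3\, \langle A_2 \rangle \\
\langle A_4 \rangle\, U_4\, \langle A_3 \rangle \\
\langle A_5 \rangle\, U_5\, \langle A_4 \rangle \\
\langle \mathit{true} \rangle\, ARB\, \langle A_5 \rangle
\end{array}
$$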

For this purpose, we manually created environments that exercise each com- 
ponent, as described in the previous section. We also specified the interface 
actions to be used for building the assumptions. We experimented with the re- 
cursive technique that we have implemented for handling multiple components 
and with the heuristic approach, that analyzes one component at a time. In 
both cases we were able to compute assumptions for the above premises to hold. 
Hence, according to the compositional rule presented in Section 2, we concluded
that the system $U_1 \parallel U_2 \parallel U_3 \parallel U_4 \parallel U_5 \parallel ARB$ indeed satisfies $P$.

The results of analysis applied to the arbiter system are shown in Table 1. 
We used a 2.2 GHz dual processor Pentium with 1 Gb of memory running Red 
Hat Enterprise Linux WS. In the table, row “Monolithic” reports the results 
obtained from the verification of the system in a non-compositional way, and rows 
“Recursive” and “Heuristic” report the results obtained by the application of the 
recursive learning scheme and the heuristic described in Section 3.1, respectively. 
Specifically, we report the memory and time consumed for verification of the 
system. For the compositional techniques, the reported time and memory refer 
to the maximum time or memory consumed for checking a single premise.
They do not include the process of generating the assumptions (reported in 
Table 2), but rather the process of applying the assume-guarantee premises once 
the assumptions are available.

Table 2. Cost of Assumption Generation

Analysis            queries   oracle 1   oracle 2   $t_{SPIN} + t_{Learn}$   $MEM_{Learn}$   $t_{LTSA}$   $MEM_{LTSA}$
Recursive           4884      48         1          5646.365s                1743 K          42.87 s      20400K
Heuristic ($A_1$)   852       12         1          818.213s                 508 K           3.076 s      4845K


The reported times are divided into three parts: $t_{model}$ is the time to create
a C model from a Promela model, $t_{compile}$ is the compilation time, and $t_{run}$
is the time to run the specific verification task in SPIN. We also report the
size of the assumptions used for compositional verification. Using the recursive
algorithm yields assumptions that have 12 states ($A_1$, $A_2$ and $A_3$) and 6 states
($A_4$ and $A_5$), while the heuristic approach yields assumptions of size 12 for each
component (for this case study, all the assumptions generated using the heuristic
approach are the weakest). We need to study further the trade-offs between the
two learning approaches: the heuristic approach has the advantage that it can
be used for the analysis of a component in isolation (in the absence of the rest of
the overall system, and maybe even before it is available), while the recursive
approach may yield smaller assumptions (as is the case here). This is expected
to happen for some systems, because the recursive approach has knowledge of
the environment of each component, and may therefore produce stronger (and
smaller) assumptions.

The results indicate that compositional verification can achieve significant 
memory savings over non-compositional verification. 
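
The size of the savings can be read off Table 1 with a short calculation, shown here as a small Python sketch. The numbers are copied from Table 1; the variable names are chosen for this example only. Recall that the compositional figures refer to the most expensive single premise and exclude the cost of assumption generation.

```python
# Memory figures from Table 1, in MB.
monolithic_mb = 544.019
compositional_mb = 2.622  # same value for the recursive and heuristic rows

# Factor by which the peak memory is reduced.
memory_ratio = monolithic_mb / compositional_mb

# Time components from Table 1: (t_model, t_compile, t_run), in seconds.
times = {
    "Monolithic": (0.021, 0.854, 33.745),
    "Recursive": (0.038, 1.142, 0.032),
    "Heuristic": (0.044, 1.392, 0.021),
}
total_seconds = {name: round(sum(parts), 3) for name, parts in times.items()}

print(f"memory reduction factor: {memory_ratio:.1f}")
for name, total in total_seconds.items():
    print(f"{name}: {total} s")
```

The peak memory drops by a factor of roughly 207, and the total time for checking the most expensive premise (about 1.2 s to 1.5 s) is dominated by compilation rather than by the verification run itself.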

Cost of assumption generation. Table 1 reports the results of compositional
analysis using assumptions that are already available. Let us now analyze the cost
of building these assumptions using learning-based techniques. Table 2 reports
the results of running the two learning approaches for assumption generation: for
the recursive approach, we report the number of queries, the number of oracle
invocations and the total time for running the algorithm (this includes $t_{Learn}$,
the time of running the Java implementation that makes external calls to SPIN,
plus $t_{SPIN}$, the total time of running SPIN multiple times for answering
queries and conjectures). For interface generation, we report the same data for
the generation of an assumption for one component (the results for the rest of
the components are similar). Table 2 also reports $MEM_{Learn}$, the memory
consumed by our Java implementation (this does not include the memory consumed
by a SPIN run, which is reported in Table 1).

Our experiments indicate a serious time overhead, where a dominant factor 
is the compilation time for queries. For example, there are 852 queries made for 
the generation of the interface specification of component U1, and the cost of 
running a query is 0.045s + 1.283s + 0.011s, where the compilation time (1.283s) 
clearly dominates. 
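
To make the proportion concrete, the per-query figures quoted above can be combined as in the following arithmetic sketch. The variable names are ours. The text identifies only 1.283s explicitly as the compilation time; reading the first and third values as t_model and t_run is an assumption based on the order in which the three times are defined.

```python
# Per-query cost components quoted in the text, in seconds.
# Assumption: the first and third values are t_model and t_run; only the
# middle value (1.283s) is explicitly named as the compilation time.
t_model = 0.045
t_compile = 1.283
t_run = 0.011

total = t_model + t_compile + t_run
compile_share = t_compile / total

print(f"cost per query: {total:.3f}s")
print(f"compilation share: {compile_share:.1%}")
```

Under this reading, compilation accounts for well over nine tenths of the cost of a single query, which is why the separate compilation of models and claims is the first optimization worth trying.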

Therefore we looked into ways of reducing the compilation time overhead for 
queries. In particular, we experimented with the SPIN feature that allows for 
the separate compilation of a model and of properties (written as never claims). 

    never { 
      do 
      :: grant_u1 -> break 
      :: !grant_u1 && !grant_u2 && !cancel_u1 && !cancel_u2 
      od; 
      do 
      :: grant_u2 -> break 
      :: !grant_u1 && !grant_u2 && !cancel_u1 && !cancel_u2 
      od; 
      do 
      :: !grant_u1 && !grant_u2 && !cancel_u1 && !cancel_u2 
      od; 
    } 


Fig. 7. A query encoded as a never claim 


Note that never claims can be used not only to define correctness properties, 
but also to restrict the search of the verifier to a user-defined subset of the 
system [19]. It is in the latter fashion that we use never-claims to attempt more 
efficient checking of queries. 

As an example, Figure 7 shows the never claim used for checking a query 
“grant!u1; grant!u2” (the analog of the query in Figure 5). Here grant_u1, 
grant_u2, cancel_u1 and cancel_u2 are global boolean flags added to the Promela 
model of a component. They are set to true whenever a corresponding rendezvous 
occurs and are reset to false on any other action. For example, grant_u1 is set 
to true (while all the other flags are reset to false) atomically with grant?u1. 
The reason we use these flags is that SPIN does not allow rendezvous actions 
in never claims. The effect is that the never claim restricts a verification run 
to all the states that conform to the trace (note that the flags need to be reset 
after every system step execution, to make sure that the never claim correctly 
restricts the system). For technical reasons (SPIN does not allow never claims 
and trace assertions to be checked at the same time), we changed the encoding 
for properties (as monitors). The encoding of queries as never claims allows us 
to compile the component model combined with the property only once and to 
compile separately the never claims for each query. Note that the same approach 
can be used for encoding assumptions. 
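
The encoding of a query as a never claim is mechanical, so it can be generated from the trace. The following is a minimal sketch of such a generator, not our actual implementation: the function names are hypothetical, the flag names follow Figure 7, and the output mirrors the shape of that figure (one loop per trace step, followed by a final loop that admits only steps with no interface action).

```python
def idle_guard(flags):
    """Guard that holds when none of the interface flags is set."""
    return " && ".join(f"!{flag}" for flag in flags)


def query_to_never_claim(trace, flags):
    """Render a trace (a list of flag names) as Promela never-claim text.

    Hypothetical helper: one do-loop per trace step, then a closing loop
    that only allows steps in which no interface flag is set.
    """
    unknown = [step for step in trace if step not in flags]
    if unknown:
        raise ValueError(f"trace uses undeclared flags: {unknown}")

    idle = idle_guard(flags)
    lines = ["never {"]
    for step in trace:
        lines += [
            "  do",
            f"  :: {step} -> break",  # the expected interface action occurred
            f"  :: {idle}",           # a step with no interface action
            "  od;",
        ]
    lines += ["  do", f"  :: {idle}", "  od;", "}"]
    return "\n".join(lines)


flags = ["grant_u1", "grant_u2", "cancel_u1", "cancel_u2"]
print(query_to_never_claim(["grant_u1", "grant_u2"], flags))
```

Because only this generated text changes from one query to the next, the component model and the property need to be compiled once, and each query contributes only a small claim to compile.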

With this new encoding, we obtained a significant reduction in running time. 
For example, the cost of heuristic interface generation for U1 was reduced by 
a factor of 4 (from 818.213s to 185.185s). We expect a similar reduction to be 
obtained for running the recursive algorithm, and even further reduction for the 
separate compilation of assumptions. 

5.3 Discussion 

The implementation described is a first step towards introducing learning-based 
assume-guarantee reasoning in the SPIN model checker. The purpose of this 
work is fast experimentation with the algorithms in the context of examples 
encoded in Promela. We intend to explore several directions for improving the 
performance of this approach in future stages of the project. 

The current implementation invokes SPIN for each query and for the two 
oracles. This involves creating appropriate Promela files, compiling them and 
running the verification at each step. While this approach works well for small 
examples, for realistic (large) examples, parsing and compiling the Promela files 
at each step is costly in terms of time. We believe that a first step towards a 
better integration will be the creation of specialized algorithms for efficient trace 
simulation (for checking queries) and for checking properties in the presence of 
restricting assumptions; these algorithms should allow for separate compilation 
of models, assumptions and properties. 
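
The per-step cost described above can be sketched as three external commands, one for each of the reported times. This is an illustration under stated assumptions rather than our Java implementation: the helper names are hypothetical, the verifier is built with default options only, and the `run` parameter exists so that the sequence can be exercised without SPIN installed.

```python
import subprocess
import time


def timed(cmd, run=subprocess.run):
    """Run one command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    run(cmd, check=True)  # raises if the command exits with a non-zero status
    return time.perf_counter() - start


def check_with_spin(promela_file, run=subprocess.run):
    """Run the three external steps for one query or oracle call.

    Returns (t_model, t_compile, t_run). The absence of extra compiler or
    verifier options is an assumption of this sketch.
    """
    t_model = timed(["spin", "-a", promela_file], run)    # generates pan.c
    t_compile = timed(["cc", "-o", "pan", "pan.c"], run)  # builds the verifier
    t_run = timed(["./pan"], run)                         # runs the check
    return t_model, t_compile, t_run
```

Calling `check_with_spin` once per query repeats the first two steps every time, even though most of the generated verifier is identical from one query to the next; this is the overhead that separate compilation is meant to remove.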

We should note that we encountered similar timing overheads with the imple- 
mentation of the learning assume-guarantee approach as a plugin for the LTSA 
model checker [1], as compared to our initial implementation within the core of 
the LTSA tool [10]. In that implementation, we encountered a significant perfor- 
mance overhead due to the fact that the plugin communicates with the LTSA 
by placing descriptions of the models in the Edit tab. As a result, each query 
or conjecture would require parsing and computing the component model. The 
avenue we took to solve the problem was to implement our techniques in the core 
of the LTSA and expose them to the LTSA plugins, while keeping the interfacing 
for our assume-guarantee reasoning as an LTSA plugin. As a result, the running 
time of our iterative learning algorithms is low. 

For example, the last two columns in Table 2 show the results of running the 
LTSA implementation for the arbiter case study. The results indicate that an 
implementation directly in SPIN is likely to similarly improve the performance 
significantly. Note that part of the gain of having the learning algorithms run 
within LTSA is that the LTSA can store the results of a particular composition 
(for a component, for example) and use it in the analysis of multiple properties. 
The impact can be great in the evaluation of queries, and it may be worth adding 
this capability in SPIN, for cases where that would be appropriate (when, for 
example, the component state space is manageable). 
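
The gap between the two implementations can be read directly from Table 2. The short computation below takes the raw ratio of the reported total times; the dictionary layout and key names are ours, and the ratio is not a controlled comparison, since the two tools differ in more than where the learning algorithm runs.

```python
# Total assumption-generation times from Table 2, in seconds.
table2 = {
    "Recursive": {"spin_learn": 5646.365, "ltsa": 42.87},
    "Heuristic": {"spin_learn": 818.213, "ltsa": 3.076},
}

# Ratio of the external-SPIN time to the in-core LTSA time for each analysis.
speedup = {name: row["spin_learn"] / row["ltsa"] for name, row in table2.items()}

for name, factor in speedup.items():
    print(f"{name}: LTSA total time is about {factor:.0f}x lower")
```

For both analyses the ratio exceeds two orders of magnitude, which is the basis for expecting a substantial gain from an implementation inside SPIN.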

A nice feature that the LTSA supports is that the plugin can extend the user 
interface of the tool, and can be invoked from the LTSA’s graphical user interface. 
As a result, the user can easily customize their assume-guarantee problem, i.e., 
select the modules and properties that participate in a compositional proof, as 
well as the rule that is to be applied. In the future, we would like to take a similar 
approach in integrating our techniques using XSpin. To achieve this, we need to 
understand better what mechanisms are available or can be added for achieving 
SPIN/XSpin extensions. Ideally, we would like to display all the components (i.e. 
processes) in a Promela specification, and to allow the user to choose which 
components to analyze using assume-guarantee reasoning. 

6 Related Work 

Assume-guarantee reasoning [9, 16, 22, 28] is based on the observation that large 
systems are being built from components and that this composition can be 
leveraged to improve the performance of analysis techniques. To reason formally 
about components in isolation, some form of assumption (either implicit or ex- 
plicit) about the interaction with, or interference from, the environment has to 
be made. Several frameworks have been proposed to support this style of 
reasoning. For example, the Calvin tool [12] provides support for assume-guarantee 
reasoning for the analysis of Java programs, while the Mocha toolkit [3] provides 
support for modular verification of components with requirement specifications 
based on the Alternating-time Temporal logic. However, the practical impact 
of these previous approaches has been limited because they require non-trivial 
human input in defining appropriate assumptions. 

As mentioned, in previous work [10, 13], we have developed techniques for 
performing assume-guarantee reasoning of software in a fully automated fashion. 
Our techniques target components with message-passing communication, a 
paradigm used in NASA mission-critical software (e.g. MER code). The approach 
presented in [10] uses L* to incrementally build appropriate assumptions, and it 
forms the basis of the work presented in this paper. Since then, several 
assume-guarantee reasoning frameworks that use L* for learning assumptions have been 
developed: [4] (see also [2]) presents a symbolic approach to assumption learning, 
while [6, 7] use learning-based assume-guarantee verification for communicating 
finite state automata specifications extracted from C code. The work presented 
here is a first attempt to introduce automated assume-guarantee reasoning in 
SPIN. In the past [27] we have studied the use of assume-guarantee reasoning 
in the context of SPIN; however, in that work, the assumptions were provided 
manually by the user. 

A related effort [17] includes a framework for thread-modular abstraction 
refinement, in which assumptions and guarantees are both refined in an iterative 
fashion. The framework applies to programs that communicate through shared 
variables, and uses predicate abstraction techniques for the iterative construction 
of assumptions. 

The problem of generating an assumption for a component is similar to the 
problem of generating component interfaces to deal with intermediate state ex- 
plosion in CRA. Several approaches have been defined for automatically abstract- 
ing a component’s environment to obtain interfaces [8, 23]. These approaches do 
not address the incremental refinement of interfaces. 

A number of machine learning approaches have been investigated recently in 
the context of software verification, with a goal different from ours. One approach 
uses learning for computing the set of reachable states in regular model checking 
[29]. The work in [15] uses L* to generate a model of a software system 
in a black box fashion; the model can then be fed to a model checker for analysis. 
Similarly, [21] presents learning techniques for building software models for ver- 
ification, while a recent approach [24] uses inductive learning to build precise 
abstractions for program analysis. 

7 Conclusions and Future Work 

In this paper we discussed our initial experience with automated assume-guarantee 
verification based on learning in the context of SPIN. We presented a 
light-weight tool that uses learning to build assumptions incrementally and that 
makes external calls to SPIN to provide all the model checking related answers. 
We discussed the application of the tool for the verification of a realistic software 
system, the resource arbiter for a spacecraft, which resulted in significant 
memory gains as compared to traditional monolithic model checking. 

While this light-weight implementation allows for a quick evaluation of the 
merits of learning-based assume-guarantee reasoning in SPIN, it may result in 
serious performance overheads, and we discussed in the paper ways of improving 
our implementation. In the future, we plan to work towards a tighter integration 
in SPIN and to investigate how we can further improve the performance of our 
approach. One possible way is to run in parallel the checks for multiple queries. 
We also plan to study how our algorithms extend to alternative communication 
mechanisms (buffered message passing) and to handling liveness properties; 
the work on learning infinitary regular sets [26] may be a good start in this 
direction. Another issue that we want to investigate is to make a finer distinction 
in our algorithms between the interface actions of a component (i.e. to distinguish 
between channel read and write operations) and to study how this affects our 
approach. 